perm filename CHAP4[4,KMC]20 blob
sn#054922 filedate 1973-07-24 generic text, type T, neo UTF8
00100 LANGUAGE-RECOGNITION PROCESSES FOR UNDERSTANDING CONVERSATION
00200 IN TELETYPED PSYCHIATRIC INTERVIEWS
00300
00400 Since the behavior being simulated by this paranoid model is
00500 the sequential language-behavior of a paranoid patient in a
00600 psychiatric interview, the model must have an ability to interpret
00700 and respond to natural language input sufficient to demonstrate
00800 conduct characteristic of the paranoid mode. By "natural language"
00900 I shall mean ordinary American English such as is used in everyday
01000 conversations. It is still difficult to be explicit about the
01100 processes which enable humans to interpret and respond to natural
01200 language. (A mighty maze! but not without a plan - A. Pope).
01300 Philosophers, linguists and psychologists have investigated natural
01400 language with various purposes. Few of the results have been useful
01500 to builders of interactive simulation models. Attempts have been
01600 made in artificial intelligence to write algorithms which
01700 "understand" teletyped natural language expressions (Colby and
01800 Enea, 1967; Enea and Colby, 1973; Schank, 1973; Winograd, 1973;
01900 Woods, 1970). Computer understanding of natural language is actively being
02000 attempted today but it is not something to be completely achieved
02100 today or even tomorrow. The problem at the moment is not to find
02200 immediately the best way of doing it but to find any way at all.
02300 During the 1960's when machine processing of natural language
02400 was dominated by syntactic considerations, it became clear that
02500 syntactical information alone was insufficient to comprehend the
02600 expressions of ordinary conversations. A current view is that to
02700 understand what is said in linguistic expressions, knowledge of
02800 syntax and semantics must be combined with beliefs from a conceptual
02900 structure capable of making inferences. How to achieve this
03000 combination efficiently with a large data-base represents a
03100 monumental task for both theory and implementation.
03200 For practical reasons we did not attempt to construct a
03300 conventional linguistic parser to analyze conversational language of
03400 interviews. Parsers to date have great difficulty in assigning a
03500 meaningful interpretation to the expressions of everyday
03600 conversational language using unrestricted English. Purely syntactic
03700 parsers offer a cancerous proliferation of interpretations. A
03800 conventional parser, lacking ignoring mechanisms, may simply halt when
03900 it comes across a word not in its dictionary. Parsers represent tight
04000 conjunctions of tests instead of the loose disjunctions needed for
04100 gleaning some meaning from everyday language communication, which
04200 may involve misunderstandings and ununderstandings. People
04300 misunderstand and ununderstand at times and thus we remain partially
04400 opaque to one another.
04500 The language-recognition process utilized by the model first
04600 puts the teletyped input in the form of a list and then determines
04700 the syntactic type of the input expression - question, statement or
04800 imperative. The expression-type is scanned to form a
04900 conceptualization, i.e. a pattern of contentives, the stress-forms of
05000 speech having conceptual meaning. The resultant conceptual pattern
05100 contains no function or closed-class terms (articles, auxiliaries,
05200 conjunctions, prepositions, etc.) except as they might represent a
05300 component in a contentive word-group. For example, the word-group
05400 (for a living) is defined to mean `work' as in "what do you do for a
05500 living?" The conceptualization is classified according to the rules
05600 of Fig. 1 as malevolent, benevolent or neutral.
05700 (INSERT FIG.1 HERE)
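The scan described above can be illustrated with a minimal sketch. Python is used here only for exposition (it is not the model's implementation language), and the word lists and classification rules are invented stand-ins for the model's dictionaries and the rules of Fig. 1:

```python
# Sketch: form a conceptualization by dropping function words, collapsing
# known word-groups into single contentives, then classifying the pattern.
# All word lists below are illustrative, not the model's actual data.

FUNCTION_WORDS = {"a", "an", "the", "do", "does", "did", "is", "are",
                  "you", "your", "for", "to", "of", "what", "how"}
WORD_GROUPS = {("for", "a", "living"): "work"}   # contentive word-groups
MALEVOLENT = {"hurt", "harm", "hate", "kill"}
BENEVOLENT = {"help", "like", "protect"}

def conceptualize(tokens):
    """Return the pattern of contentives, collapsing known word-groups."""
    words = [w for w in (t.lower().strip("?.!,") for t in tokens) if w]
    out, i = [], 0
    while i < len(words):
        for group, meaning in WORD_GROUPS.items():
            if tuple(words[i:i + len(group)]) == group:
                out.append(meaning)
                i += len(group)
                break
        else:
            if words[i] not in FUNCTION_WORDS:
                out.append(words[i])
            i += 1
    return out

def classify(pattern):
    """Label the conceptual pattern malevolent, benevolent or neutral."""
    if any(w in MALEVOLENT for w in pattern):
        return "malevolent"
    if any(w in BENEVOLENT for w in pattern):
        return "benevolent"
    return "neutral"

print(conceptualize("What do you do for a living ?".split()))
print(classify(conceptualize("Do you want to hurt me ?".split())))
```

Note that "what do you do for a living?" reduces to the single contentive `work`, since the function words are dropped and the word-group is collapsed.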
05800 How language is understood depends on the intentions of the
05900 producers and interpreters in the dialogue. Thus language is
06000 understood in accordance with a participant's view of the
06100 situation. Our purpose was to develop a method for understanding
06200 sequences of everyday English sufficient for the model to communicate
06300 linguistically in a paranoid way in the circumscribed situation of a
06400 psychiatric interview. Such an interview is not small talk; a job is
06500 to be done.
06600 We did not try to construct a general-purpose algorithm which
06700 could understand anything said in English by anybody to anybody in
06800 any dialogue situation. (Does anyone believe it possible?) We sought
06900 only to extract, distill or cull an idiosyncratic, idiolectic meaning
07000 or even a gist of a meaning from the input.
07100 Natural language is not an agreed-on universe of discourse
07200 such as arithmetic wherein symbols have the same meaning for everyone
07300 who uses them. What we loosely call "natural language" is actually a
07400 set of history-dependent idiolects, each being unique to the
07500 individual with a unique history. To be unique does not mean that no
07600 property is shared with other individuals, only that not every
07700 property is shared. It is the broad overlap of idiolects which allows
07800 the communication of shared meanings in everyday conversation.
07900 We took as pragmatic measures of "understanding" the
08000 ability (1) to form a conceptualization so that questions can be
08100 answered and commands carried out, (2) to determine the intention of
08200 the interviewer, (3) to determine the references for pronouns and
08300 other anticipated topics. This straightforward approach to a complex
08400 problem has its drawbacks, as will be shown, but we strove for a
08500 highly individualized idiolect sufficient to demonstrate paranoid
08600 processes of an individual in a particular situation rather than for
08700 a general supra-individual or ideal comprehension of English. If the
08800 language-recognition processes interfered with demonstrating the
08900 paranoid processes, we would consider it defective and insufficient
09000 for our purposes.
09100 Some special problems a dialogue algorithm must handle in a
09200 psychiatric interview will now be outlined along with a brief
09300 description of how the model deals with them.
09400
09500 .F
09600 QUESTIONS
09700
09800 The principal expression-type used by an interviewer consists
09900 of a question. A question is recognized by its beginning with a wh-
10000 or how form and/or the expression ending with a question-mark. In
10100 teletyped interviews a question may sometimes be put in declarative
10200 form followed by a question mark as in:
10300 .V
10400 (1) PT.- I LIKE TO GAMBLE ON THE HORSES.
10500 (2) DR.- YOU GAMBLE?
10600 .END
10700 Although a question-word or auxiliary verb is missing in (2), the
10800 model recognizes that a question is being asked about its gambling
10900 simply by the question mark.
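The question test just described can be sketched as follows; the wh-form list is illustrative:

```python
# Sketch: an input counts as a question if it begins with a wh- or how
# form, or if it ends with a question mark (covering declaratives such
# as "YOU GAMBLE?").

WH_FORMS = ("who", "what", "when", "where", "which", "why", "whose", "how")

def is_question(line):
    words = line.strip().lower().split()
    return line.strip().endswith("?") or (bool(words) and words[0] in WH_FORMS)

print(is_question("YOU GAMBLE?"))
print(is_question("WHY IS THAT"))
```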
11000 Particularly difficult are those `when' questions which
11100 require a memory which can assign each event a beginning, an end and
11200 a duration. An improved version of the model should have this
11300 capacity. Also troublesome are questions such as `how often', `how
11400 many', i.e. a `how' followed by a quantifier. If the model has "how
11500 often" on its expectancy list while a topic is under discussion, the
11600 appropriate reply can be made. Otherwise the model fails to
11700 understand.
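The expectancy test for such quantifier questions might be sketched as below; the expectancy entries and the canned reply are invented for illustration:

```python
# Sketch: while a topic is under discussion, anticipated forms such as
# "how often" sit on an expectancy list; a match licenses the prepared
# reply, otherwise None signals the ununderstanding default.

def check_expectancy(line, expectancy):
    low = line.lower()
    for form, reply in expectancy:
        if form in low:
            return reply
    return None   # fall through to the ununderstanding default

expectancy = [("how often", "EVERY WEEK OR SO.")]
print(check_expectancy("HOW OFTEN DO YOU GAMBLE?", expectancy))
```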
11800 In constructing a simulation of symbolic processes it is
11900 arbitrary how much information to represent in the data-base. Should
12000 the model know what is the capital of Alabama? It is trivial to store
12100 a lot of facts and there always will be boundary conditions. We took
12200 the position that the model should know only what we believed it
12300 reasonable to know relevant to a few hundred topics expectable in a
12400 psychiatric interview. Thus the model performs poorly when subjected
12500 to baiting `exam' questions designed to test its informational
12600 limitations rather than to seek useful psychiatric information.
12700
12800 .F
12900 IMPERATIVES
13000
13100 Typical imperatives in a psychiatric interview consist of
13200 expressions like:
13300 .V
13400 (3) DR.- TELL ME ABOUT YOURSELF.
13500 (4) DR.- LETS DISCUSS YOUR FAMILY.
13600 .END
13700 Such imperatives are actually interrogatives to the
13800 interviewee about the topics they refer to. Since the only physical
14000 action the model can perform is to `talk', imperatives are treated
14000 as requests for information. They are identified by the common
14100 introductory phrases: "tell me", "lets talk about", etc.
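Identification by introductory phrase can be sketched as follows; the phrase list is illustrative:

```python
# Sketch: an imperative is recognized by a common introductory request
# phrase, and the remainder of the line is taken as the topic asked about.

INTRO_PHRASES = ("tell me about", "tell me", "lets talk about", "lets discuss")

def imperative_topic(line):
    """Return the topic of a request, or None if no request phrase opens it."""
    low = line.lower().rstrip(".?! ")
    for phrase in INTRO_PHRASES:
        if low.startswith(phrase):
            return low[len(phrase):].strip() or None
    return None

print(imperative_topic("LETS DISCUSS YOUR FAMILY."))
```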
14200 .F
14300 DECLARATIVES
14400
14500 In this category is lumped everything else. It includes
14600 greetings, farewells, yes-no type answers, existence assertions and
14700 the usual predications.
14800
14900 .F
15000 AMBIGUITIES
15100
15200 Words have more than one sense, a convenience for human
15300 memories but a struggle for language-understanding algorithms.
15400 Consider the word "bug" in the following expressions:
15500 .V
15600 (5) AM I BUGGING YOU?
15700 (6) AFTER A PERIOD OF HEAVY DRINKING HAVE YOU FELT BUGS ON
15800 YOUR SKIN?
15900 (7) DO YOU THINK THEY PUT A BUG IN YOUR ROOM?
16000 .END
16100 In expression (5) the term "bug" means to annoy, in (6) it
16200 refers to an insect and in (7) it refers to a microphone used for
16300 hidden surveillance. The model uses context to carry out
16400 disambiguation. For example, when the Mafia is under discussion and
16500 the affect-variable of fear is high, the model interprets "bug" to
16600 mean microphone. In constructing this hypothetical individual we
16700 took advantage of the nature of idiolects which can have an arbitrary
16800 restriction on word senses. One characteristic of the paranoid mode
16900 is that no matter in what sense the interviewer uses a word, the
17000 patient may idiosyncratically interpret it in some sense of his own.
17100 This property is obviously of great help for an interactive
17200 simulation with limited language-understanding abilities.
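Context-driven disambiguation of "bug" can be sketched roughly as below; the surface-form test, the topic names, and the fear threshold are all invented for illustration:

```python
# Sketch: the sense chosen for "bug" depends on its surface form and on
# conversational context (current topic and the fear affect-variable).

def disambiguate_bug(words, topic, fear):
    """Pick an idiolectic sense for 'bug' from form and context."""
    if "bugging" in words:                # verb form: to annoy
        return "annoy"
    if topic == "mafia" and fear > 0.5:   # surveillance context
        return "microphone"
    return "insect"

print(disambiguate_bug("did they put a bug in your room".split(),
                       topic="mafia", fear=0.9))
```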
17300 .F
17400 ANAPHORIC REFERENCES
17500 The common anaphoric references consist of the pronouns "it",
17600 "he", "him", "she", "her", "they", "them" as in:
17700 .V
17800 (8) PT.-HORSERACING IS MY HOBBY.
17900 (9) DR.-WHAT DO YOU ENJOY ABOUT IT?
18000 .END
18100 When a topic is introduced by the patient as in (8), a
18200 number of things can be expected to be asked about it. Thus the
18300 algorithm has ready an updated expectancy-anaphora list which allows
18400 it to determine whether the topic introduced by the model is being
18500 responded to or whether the interviewer is continuing with the
18600 previous topic.
18700 The algorithm recognizes "it" in (9) as referring to
18800 "horseracing" because a flag for horseracing was set when horseracing
18900 was introduced in (8), "it" was placed on the expected anaphora list,
19000 and no new topic has been introduced. A more difficult problem arises
19100 when the anaphoric reference points more than one I-O pair back in
19200 the dialogue as in:
19300 .V
19400 (10) PT.-THE MAFIA IS OUT TO GET ME.
19500 (11) DR.- ARE YOU AFRAID OF THEM?
19600 (12) PT.- MAYBE.
19700 (13) DR.- WHY IS THAT?
19800 .END
19900 The "that" of expression (13) does not refer to (12) but to
20000 the topic of being afraid which the interviewer introduced in (11).
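The flag-and-expectancy mechanism for pronouns like "it" can be sketched as follows; the class and its names are illustrative, not the model's actual data structures:

```python
# Sketch: introducing a topic sets a flag and places anticipated
# pronouns on an expected-anaphora list; a later pronoun, with no new
# topic intervening, resolves to the flagged topic.

class AnaphoraTracker:
    def __init__(self):
        self.topic = None
        self.expected = set()

    def introduce(self, topic, pronouns=("it",)):
        """Flag a topic and record which pronouns may refer back to it."""
        self.topic = topic
        self.expected = set(pronouns)

    def resolve(self, pronoun):
        """Return the flagged topic if the pronoun was anticipated."""
        return self.topic if pronoun in self.expected else None

tracker = AnaphoraTracker()
tracker.introduce("horseracing")
print(tracker.resolve("it"))
```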
20100 Another pronominal confusion occurs when the interviewer uses
20200 `we' in two senses as in:
20300 .V
20400 (14) DR.- WE WANT YOU TO STAY IN THE HOSPITAL.
20500 (15) PT.- I WANT TO BE DISCHARGED NOW.
20600 (16) DR.- WE ARE NOT COMMUNICATING.
20700 .END
20800 In expression (14) the interviewer is using "we" to refer to
20900 psychiatrists or the hospital staff while in (16) the term refers to
21000 the interviewer and patient. Identifying the correct referent would
21100 require beliefs about the dialogue itself.
21200
21300 .F
21400 TOPIC SHIFTS
21500
21600 In the main a psychiatric interviewer is in control of the
21700 interview. When he has gained sufficient information about a topic,
21800 he shifts to a new topic. Naturally the algorithm must detect this
21900 change of topic as in the following:
22000 .V
22100 (17) DR.- HOW DO YOU LIKE THE HOSPITAL?
22200 (18) PT.- ITS NOT HELPING ME TO BE HERE.
22300 (19) DR.- WHAT BROUGHT YOU TO THE HOSPITAL?
22400 (20) PT.- I AM VERY UPSET AND NERVOUS.
22500 (21) DR.- WHAT TENDS TO MAKE YOU NERVOUS?
22600 (22) PT.- JUST BEING AROUND PEOPLE.
22700 (23) DR.- ANYONE IN PARTICULAR?
22800 .END
22900 In (17) and (19) the topic is the hospital. In (21) the topic
23000 changes to causes of the patient's nervous state.
23100 Topics touched upon previously can be re-introduced at any
23200 point in the interview. The model knows that a topic has been
23300 discussed previously because a topic-flag is set when a topic comes
23400 up.
23500
23600 .F
23700 META-REFERENCES
23800
23900 These are references, not about a topic directly, but about
24000 what has been said about the topic as in:
24100 .V
24200 (25) DR.- WHY ARE YOU IN THE HOSPITAL?
24300 (26) PT.- I SHOULDNT BE HERE.
24400 (27) DR.- WHY DO YOU SAY THAT?
24500 .END
24600 The expression (27) is about and meta to expression (26). The model
24700 does not respond with a reason why it said something but with a
24800 reason for the content of what it said, i.e. it interprets (27) as
24900 "why shouldnt you be here?"
25000 Sometimes when the patient makes a statement, the doctor
25100 replies, not with a question, but with another statement which
25200 constitutes a rejoinder as in:
25300 .V
25400 (28) PT.- I HAVE LOST A LOT OF MONEY GAMBLING.
25500 (29) DR.- I GAMBLE QUITE A BIT ALSO.
25600 .END
25700 Here the algorithm interprets (29) as a directive to
25800 continue discussing gambling, not as an indication to question the
25900 doctor about gambling.
26000
26100 .F
26200 ELLIPSES
26300
26400
26500 In dialogues one finds many ellipses, expressions from which
26600 one or more words are omitted as in:
26700 .V
26800 (30 ) PT.- I SHOULDNT BE HERE.
26900 (31) DR.- WHY NOT?
27000 .END
27100 Here the complete construction must be understood as:
27200 .V
27300 (32) DR.- WHY SHOULD YOU NOT BE HERE?
27400 .END
27500 Again this is handled by the expectancy-anaphora list which
27600 anticipates a "why not".
27700 The opposite of ellipsis is redundancy which usually provides
27800 no problem since the same thing is being said more than once as in:
27900 .V
28000 (33 ) DR.- LET ME ASK YOU A QUESTION.
28100 .END
28200 The model simply recognizes (33) as a stereotyped pattern.
28300
28400 .F
28500 SIGNALS
28600
28700 Some fragmentary expressions serve only as directive signals
28800 to proceed as in:
28900 .V
29000 (34) PT.- I WENT TO THE TRACK LAST WEEK.
29100 (35) DR.- AND?
29200 .END
29300 The fragment of (35) requests a continuation of the story introduced
29400 in (34). The common expressions found in interviews are "and", "so",
29500 "go on", "go ahead", "really", etc. If an input expression cannot be
29600 recognized at all, the lowest level default condition is to assume it
29700 is a signal and either proceed with the next line in a story under
29800 discussion or, if there is none, begin a new story with a
29900 prompting question or statement.
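This lowest-level default can be sketched as follows; the signal list comes from the text, while the story and prompt contents are invented examples:

```python
# Sketch: a fragmentary signal, or any unrecognizable input, is taken as
# a directive to proceed: continue the current story if one is under
# discussion, otherwise open a new story with a prompt.

SIGNALS = {"and", "so", "go on", "go ahead", "really"}

def is_signal(line):
    return line.lower().strip("?.! ") in SIGNALS

def proceed(story, prompts):
    """Default action for a signal or an unrecognized input."""
    if story:
        return story.pop(0)    # next line of the story under discussion
    return prompts.pop(0)      # otherwise begin a new story

story = ["I LOST A HUNDRED DOLLARS AT THE TRACK."]
prompts = ["DO YOU KNOW ANYTHING ABOUT THE UNDERWORLD?"]
if is_signal("AND?"):
    print(proceed(story, prompts))
```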
30000
30100 .F
30200 IDIOMS
30300
30400 Since so much of conversational language is stereotyped, the
30500 task of recognition is much easier than that of analysis. This is
30600 particularly true of idioms. Either one knows what an idiom means or
30700 one does not. It is usually hopeless to try to decipher what an idiom
30800 means from an analysis of its constituent parts. If the reader doubts
30900 this, let him ponder the following expressions taken from actual
31000 teletyped interviews.
31100 .V
31200 (36) DR.- WHATS EATING YOU?
31300 (37) DR.- YOU SOUND KIND OF PISSED OFF.
31400 (38) DR.- WHAT ARE YOU DRIVING AT?
31500 (39) DR.- ARE YOU PUTTING ME ON?
31600 (40) DR.- WHY ARE THEY AFTER YOU?
31700 (41) DR.- HOW DO YOU GET ALONG WITH THE OTHER PATIENTS?
31800 (42) DR.- HOW DO YOU LIKE YOUR WORK?
31900 (43) DR.- HAVE THEY TRIED TO GET EVEN WITH YOU?
32000 (44) DR.- I CANT KEEP UP WITH YOU.
32100 .END
32200 In people, the understanding of idioms is a matter of rote
32300 memory. In an algorithm, idioms can simply be stored as such. As
32400 each new idiom appears in teletyped interviews,
32500 its recognition-pattern is added to the data-base since what happens
32600 once can happen again.
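Rote storage of idioms amounts to a simple table lookup; the entries and glosses below are illustrative:

```python
# Sketch: idioms are stored whole as recognition patterns mapped to
# their meanings; a new idiom seen in an interview is simply added.

IDIOMS = {
    "whats eating you": "what is bothering you",
    "are you putting me on": "are you teasing me",
}

def recognize_idiom(line):
    """Return the stored meaning of an idiom, or None if unknown."""
    return IDIOMS.get(line.lower().strip("?.! "))

# an idiom encountered in a later interview is added to the data-base
IDIOMS["i cant keep up with you"] = "i do not understand you"

print(recognize_idiom("WHATS EATING YOU?"))
```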
32700 Another advantage in constructing an idiolect for a model is
32800 that it understands its own idiomatic expressions which tend to be
32900 used by the interviewer (if he understands them) as in:
33000 .V
33100 (45) PT.- THEY ARE OUT TO GET ME.
33200 (46) DR.- WHAT MAKES YOU THINK THEY ARE OUT TO GET YOU.
33300 .END
33400 The expression (45) is really a double idiom in which "out"
33500 means `intend' and "get" means `harm' in this context. Needless to
33600 say, an algorithm which tried to pair off the various meanings of
33700 "out" with the various meanings of "get" would have a hard time of
33800 it. But an algorithm which recognizes what it itself is capable of
33900 saying, can easily recognize echoed idioms.
34000
34100 .F
34200 FUZZ TERMS
34300
34400 In this category fall a large number of expressions which
34500 have little or no meaning and therefore can be ignored by the
34600 algorithm. The lower-case expressions in the following are examples
34700 of fuzz:
34800 .V
34900 (47) DR.- well now perhaps YOU CAN TELL ME something ABOUT
35000 YOUR FAMILY.
35100 (48) DR.- on the other hand I AM INTERESTED IN YOU.
35200 (49) DR.- hey I ASKED YOU A QUESTION.
35300 .END
35400 The algorithm has "ignoring mechanisms" which allow for
35500 an `anything' slot in its pattern recognition. Fuzz terms are thus
35600 easily ignored and no attempt is made to analyze them.
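The `anything' slot can be sketched as an in-order scan for the pattern's required contentives, skipping whatever lies between them; the pattern contents are illustrative:

```python
# Sketch: a pattern is a sequence of required contentives; fuzz is
# ignored by scanning the input for the required words in order,
# letting anything at all stand between them.

def match_pattern(pattern, tokens):
    """True if the pattern's words occur in order, anything in between."""
    it = iter(w.lower().strip("?.!,") for w in tokens)
    return all(p in it for p in pattern)   # membership consumes the iterator

line = "well now perhaps you can tell me something about your family".split()
print(match_pattern(["tell", "family"], line))
```

The membership test on the shared iterator is what gives the in-order behavior: each required word is searched for only past the point where the previous one matched.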
35700
35800 .F
35900 SUBORDINATE CLAUSES
36000
36100 A subordinate clause is a complete statement inside another
36200 statement. It is most frequently introduced by a relative pronoun,
36300 indicated in the following expressions by lower case:
36400 .V
36500 (50) DR.- WAS IT THE UNDERWORLD that PUT YOU HERE?
36600 (51) DR.- WHO ARE THE PEOPLE who UPSET YOU?
36700 (52) DR.- HAS ANYTHING HAPPENED which YOU DONT UNDERSTAND?
36800 .END
36900 One of the linguistic weaknesses of the model is that it
37000 takes the entire input as a single expression. When the input is
37100 syntactically complex, such as possessing subordinate clauses, the
37200 algorithm can become confused. To avoid this, future versions of the
37300 model will segment the input into more manageable phrases.
37400 .F
37500 VOCABULARY
37600
37700 How many words should there be in the algorithm's vocabulary?
37800 It is a rare human speaker of English who can recognize 40% of the
37900 415,000 words in the Oxford English Dictionary. In his everyday
38000 conversation an educated person uses perhaps 10,000 words and has a
38100 recognition vocabulary of about 50,000 words. A study of phone
38200 conversations showed that 96% of the talk employed only 737 words
38300 (French, Carter, and Koenig, 1930). Of course if the remaining 4% are
38400 important but unrecognized contentives, the result may be ruinous to
38500 the continuity of a conversation.
38600 In counting all the words in 53 teletyped psychiatric
38700 interviews conducted by psychiatrists, we found only 721 different
38800 words. Since we are familiar with psychiatric vocabularies and
38900 styles of expression, we believed this language-algorithm could
39000 function adequately with a vocabulary of at most a few thousand
39100 contentives. There will always be unrecognized words. The algorithm must
39200 be able to continue even if it does not have a particular word in its
39300 vocabulary. This provision represents one great advantage of
39400 pattern-matching over conventional linguistic parsing. Our algorithm
39500 can guess while a parser must know with certainty in order to
39600 proceed.
39700
39800 .F
39900 MISSPELLINGS AND EXTRA CHARACTERS
40000 There is really no good defense against misspellings in a
40100 teletyped interview except having a human monitor the conversation
40200 and make the necessary corrections. Spelling correcting programs are
40300 slow, inefficient, and imperfect. They experience great problems
40400 when it is the first character in a word which is incorrect.
40500 Extra characters sent over the teletype by the interviewer or
40600 by a bad phone line can be removed by a human monitor since the
40700 output from the interviewer first appears on the monitor's console
40800 and then is typed by her directly to the program.
40900
41000 .F
41100 META VERBS
41200
41300 Certain common verbs such as "think", "feel", "believe", etc.
41400 can take a clause as their objects as in:
41500 .V
41600 (54) DR.- I THINK YOU ARE RIGHT.
41700 (55) DR.- WHY DO YOU FEEL THE GAMBLING IS CROOKED?
41800 .END
41900 The verb "believe" is peculiar since it can also take as
42000 object a noun or noun phrase as in:
42100 .V
42200 (56) DR.- I BELIEVE YOU.
42300 .END
42400 In expression (55) the conjunction "that" can follow the word
42500 "feel" signifying a subordinate clause. This is not the case after
42600 "believe" in expression (56). The model makes the correct distinction
42700 in (56) because nothing follows the "you".
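The distinction drawn for "believe" can be sketched crudely as a check on what follows the object pronoun; this is an illustrative simplification of the model's actual test:

```python
# Sketch: if nothing (or only punctuation) follows the pronoun after
# "believe", the pronoun itself is the object; if more words follow, a
# subordinate clause is taken as the object.

def believe_object(tokens):
    """Classify the object of 'believe' as a pronoun or a clause."""
    words = [w.lower().strip("?.!,") for w in tokens]
    rest = words[words.index("believe") + 1:]
    return "clause" if len(rest) > 1 else "pronoun"

print(believe_object("I BELIEVE YOU.".split()))
print(believe_object("I BELIEVE YOU ARE TRYING TO HELP.".split()))
```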
42800 .F
42900 ODD WORDS
43000 From extensive experience with teletyped interviews, we
43100 learned the model must have patterns for "odd" words. We term them
43200 such since these are words which are quite natural in the usual
43300 vis-a-vis interview in which the participants communicate through
43400 speech but which are quite odd in the context of a teletyped
43500 interview. This should be clear from the following examples in which
43600 the odd words appear in lower case:
43700 .V
43800 (57) DR.-YOU sound CONFUSED.
43900 (58) DR.- DID YOU hear MY LAST QUESTION?
44000 (59) DR.- WOULD YOU come in AND sit down PLEASE?
44100 (60) DR.- CAN YOU say WHO?
44200 (61) DR.- I WILL see YOU AGAIN TOMORROW.
44300 .END
44400
44500
44600 .F
44700 MISUNDERSTANDING
44800
44900 It is perhaps not fully recognized by students of language
45000 how often people misunderstand one another in conversation and yet
45100 their dialogues proceed as if understanding and being understood had
45200 taken place.
45300 A classic example is the following man-on-the-street interview.
45400 .V
45500 INTERVIEWER - WHAT DO YOU THINK OF MARIHUANA?
45600 MAN - DIRTIEST TOWN IN MEXICO.
45700 INTERVIEWER - HOW ABOUT LSD?
45800 MAN - I VOTED FOR HIM.
45900 INTERVIEWER - HOW DO YOU FEEL ABOUT THE INDIANAPOLIS 500?
46000 MAN - I THINK THEY SHOULD SHOOT EVERY LAST ONE OF THEM.
46100 INTERVIEWER - AND THE VIET CONG POSITION?
46200 MAN - I'M FOR IT, BUT MY WIFE COMPLAINS ABOUT HER ELBOWS.
46300 .END
46400 Sometimes a psychiatric interviewer realizes when
46500 misunderstanding occurs and tries to correct it. Other times he
46600 simply passes it by. It is characteristic of the paranoid mode to
46700 respond idiosyncratically to particular word-concepts regardless of
46800 what the interviewer is saying:
46900 .V
47000 (62) PT.- SOME PEOPLE HERE MAKE ME NERVOUS.
47100 (63) DR.- I BET.
47200 (64) PT.- GAMBLING HAS BEEN NOTHING BUT TROUBLE FOR ME.
47300 .END
47400 Here one word sense of "bet" (to wager) is confused with the offered
47500 sense of expressing agreement. As has been emphasized, this property
47600 of paranoid conversation eases the task of simulation.
47700 .F
47800 UNUNDERSTANDING
47900
48000 A dialogue algorithm must be prepared for situations in which
48100 it simply does not understand, i.e. it cannot arrive at any
48200 interpretation as to what the interviewer is saying since no pattern
48300 can be matched. An algorithm should not be faulted for a lack of
48400 facts as in:
48500 .V
48600 (65) DR.- WHO IS THE PRESIDENT OF TURKEY?
48700 .END CONTINUE
48800 when the data-base does not contain the word
48900 "Turkey". In this default condition it is simplest to reply:
49000 .V
49100 (66) PT.- I DONT KNOW.
49200 .END CONTINUE
49300 and dangerous to reply:
49400 .V
49500 (67) PT.- COULD YOU REPHRASE THE QUESTION?
49600 .END CONTINUE
49700 because of the disastrous loops which can result.
49800 Since the main problem in the default condition of
49900 ununderstanding is how to continue, the model employs heuristics such
50000 as changing the level of the dialogue and asking about the
50100 interviewer's intention as in:
50200 .V
50300 (68) PT.- WHY DO YOU WANT TO KNOW THAT?
50400 .END CONTINUE
50500 or rigidly continuing with a previous topic or introducing a new
50600 topic.
50700 These are admittedly desperate measures intended to prompt
50800 the interviewer in directions the algorithm has a better chance of
50900 understanding. Usually it is the interviewer who controls the flow
51000 from topic to topic but there are times when control must be assumed
51100 by the algorithm.
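The rotation among these desperate measures can be sketched as below; cycling through a fixed list is an illustrative simplification of the model's heuristics, and the replies are drawn from the examples above:

```python
# Sketch: on a failed match the model admits ignorance or questions the
# interviewer's intention, alternating so as not to repeat itself, and
# never asks for a rephrasing (which risks disastrous loops).

import itertools

RESPONSES = itertools.cycle([
    "I DONT KNOW.",
    "WHY DO YOU WANT TO KNOW THAT?",
])

def default_reply():
    """Produce the next default reply for the ununderstanding condition."""
    return next(RESPONSES)

print(default_reply())
print(default_reply())
```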
51200 There are many additional problems in understanding
51300 conversational language but the above description should be
51400 sufficient to convey some of the complexities involved. Further
51500 examples will be presented in the next chapter in describing the
51600 logic of the central processes of the model.